.. _long_analysis:

Longitudinal Data Analysis
===========================

.. |pid ppfadl| raw:: html

    "pid"

.. |pid ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/pid}{\textbf{"pid"}}

.. |hid ppfadl| raw:: html

    "hid"

.. |hid ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/hid}{\textbf{"hid"}}

.. |syear ppfadl| raw:: html

    "syear"

.. |syear ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/syear}{\textbf{"syear"}}

.. |netto ppfadl| raw:: html

    "netto"

.. |netto ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/netto}{\textbf{"netto"}}

.. |phrf ppfadl| raw:: html

    "phrf"

.. |phrf ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/phrf}{\textbf{"phrf"}}

.. |migback ppfadl| raw:: html

    "migback"

.. |migback ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/migback}{\textbf{"migback"}}

.. |sex ppfadl| raw:: html

    "sex"

.. |sex ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/ppathl/sex}{\textbf{"sex"}}

.. |pgfamstd pgen| raw:: html

    "pgfamstd"

.. |pgfamstd pgen2| raw:: latex

    \href{https://paneldata.org/soep-core/data/pgen/pgfamstd}{\textbf{"pgfamstd"}}

.. |plh0182 ppfadl| raw:: html

    "plh0182"

.. |plh0182 ppfadl2| raw:: latex

    \href{https://paneldata.org/soep-core/data/pl/plh0182}{\textbf{"plh0182"}}

Simple cross-sectional analyses show that married people have higher life satisfaction than singles. You want to check this finding with a longitudinal analysis of the SOEP data.

**Create an exercise path with four subfolders:**

.. figure:: png/uebungspfade.png
    :align: center

**Example:**

- H:/material/exercises/do
- H:/material/exercises/output
- H:/material/exercises/temp
- H:/material/exercises/log

These are used to store your script, log files, datasets, and temporary datasets. Open an empty do-file and define the paths you created with globals:

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 17-25

The global "AVZ" defines the main path. The main path is subdivided using the globals "MY_IN_PATH", "MY_DO_FILES", "MY_LOG_OUT", "MY_OUT_DATA", and "MY_OUT_TEMP". The global "MY_IN_PATH" contains the path to the data you ordered.

**Create a master file that uses the important variables from PPATHL.**

You should always add some variables from PPATHL to your dataset by default. Load the following information from PPATHL:

- Individual identifier |pid ppfadl| |pid ppfadl2|
- Household identifier |hid ppfadl| |hid ppfadl2|
- Survey year |syear ppfadl| |syear ppfadl2|
- The net variable with information on the interview type |netto ppfadl| |netto ppfadl2|
- The weighting variable |phrf ppfadl| |phrf ppfadl2|
- The gender of the person |sex ppfadl| |sex ppfadl2|
- The migration background |migback ppfadl| |migback ppfadl2|

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 28-30

**Search for matching variables and add them to your dataset**

To perform your analysis, you need different SOEP variables. The SOEP offers various options for a variable search:

- Search the questionnaires for useful variables (for more information, see the section :ref:`quest_search`).
- Find a suitable variable via the topic list of paneldata.org (for more information, see the section :ref:`topic`).
- Search for a suitable variable using a search term in paneldata.org (for more information, see the section :ref:`var_search`).
- Use the documentation provided on the generated variables (for more information, see the section :ref:`documentation`).

In this case, we use the variables |pgfamstd pgen| |pgfamstd pgen2| (marital status) and |plh0182 ppfadl| |plh0182 ppfadl2| (life satisfaction).

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 35-40

Clean and inspect the data
----------------------------

Recode all missing values to system missing.
Because your analysis focuses on individual characteristics, delete all observations that are not based on successful individual interviews.

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 44-47

.. figure:: png/SOEPlong_01.png
    :align: center

**How many people contribute measurements and what is the proportion of people contributing at least 10 waves in a row?**

Define the dataset as a panel dataset.

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 50-51

.. figure:: png/SOEPlong_02.png
    :align: center

105,068 respondents have contributed information in waves a (1984) to bk (2020), and 75% of these respondents have provided information for at least 10 waves.

**How many people took part in the survey in 2010 and contributed to continuous measurements up to 2014?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 52

.. figure:: png/SOEPlong_03.png
    :align: center

14,673 respondents provided continuous information from 2010 to 2014.

Univariate inspection & analysis
---------------------------------

**How does the mean of life satisfaction change over time?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 61-62

.. figure:: png/SOEPlong_04.png
    :align: center

**What proportion of people are a) married in 2014 or b) have a migration background? Compare weighted with unweighted frequency tables: Who is overrepresented in SOEP?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 64-66

.. figure:: png/SOEPlong_05.png
    :align: center

.. figure:: png/SOEPlong_06.png
    :align: center

The data show that married people are overrepresented in the SOEP and single people are underrepresented. Weighting makes the sample representative of Germany again.

.. figure:: png/SOEPlong_05b.png
    :align: center

.. figure:: png/SOEPlong_07.png
    :align: center

In the SOEP sample, respondents with a direct or indirect migration background are overrepresented.
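The comparison of weighted and unweighted frequency tables can be sketched as follows. This is a minimal illustration and not necessarily the code in ``SOEPlong.do``: it assumes that the analysis dataset is in memory, that the variables ``pgfamstd``, ``migback``, ``syear``, and ``phrf`` are available as described above, and that the migration background is also evaluated for 2014.

.. code-block:: stata

    * Sketch only: assumes the analysis dataset is in memory and that
    * restricting both comparisons to 2014 is what is intended.

    * Marital status in 2014: unweighted, then weighted with phrf
    tabulate pgfamstd if syear == 2014
    tabulate pgfamstd if syear == 2014 [aweight = phrf]

    * Migration background in 2014: unweighted, then weighted with phrf
    tabulate migback if syear == 2014
    tabulate migback if syear == 2014 [aweight = phrf]

Comparing the percentage columns within each pair of tables shows which groups have a larger share in the unweighted sample than in the weighted one, that is, which groups are overrepresented.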
**How many of those persons who reported a life satisfaction scale value of 7 in one survey year also reported a value of 7 in the following survey year?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 68

.. figure:: png/SOEPlong_08.png
    :align: center

34.57% of the respondents who reported a life satisfaction of 7 reported a value of 7 again in the following year.

**Is it more likely that a highly dissatisfied person (value: 0) will be less dissatisfied the following year or that a very satisfied person (value: 10) will be less satisfied the following year?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 68

.. figure:: png/SOEPlong_08.png
    :align: center

The rows reflect the initial values, and the columns reflect the final values. Around 20% of those who were completely dissatisfied (value: 0) in the base year remained completely dissatisfied in the following year. About 80% of these completely dissatisfied people were more satisfied in the following year. Of the completely satisfied persons (value: 10), about 37% remained just as satisfied in the following year, but 63% became less satisfied. It is therefore more likely that a completely dissatisfied person will become more satisfied in the following year (about 80%) than that a completely satisfied person will become less satisfied (63%).

**Which transitions in marital status can be observed particularly frequently in the data?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 69

.. figure:: png/SOEPlong_09.png
    :align: center

A particularly frequent transition (about 19%) is from married but living separately [value 2] in the base year to divorced [value 4] in the following year.

Simple cross-sectional analyses
--------------------------------

You now want to find the correlation between marital status and life satisfaction. Is there an effect of marriage on life satisfaction? And if so, is it a sustained effect?
**First, calculate the correlation between family status and life satisfaction from a cross-sectional perspective for 2010: Are married people happier than singles?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 74

.. figure:: png/SOEPlong_10.png
    :align: center

At first glance, married people seem happier than singles. Now generate a variable that indicates a transition from "single" to "married".

**How many such transitions can you find in the data?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 77-79

.. figure:: png/SOEPlong_11.png
    :align: center

A total of 5,559 people can be observed changing status from single to married.

**What is the average level of life satisfaction immediately after the transition to marriage (i.e., in the first survey in which the transition can be observed) and how high is life satisfaction immediately before the transition to marriage?**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 81-86

.. figure:: png/SOEPlong_12.png
    :align: center

Before the transition to marriage, the average life satisfaction of the respondents is 7.59. In the following year, that is, after the transition to marriage, it is 7.69. Average life satisfaction thus rises slightly, by 0.10, with the transition to marriage.

**Map the complete satisfaction history around the "marriage entry" event [3 years before; 3 years after].**

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 92-101

.. figure:: png/SOEPlong_14.png
    :align: center

Choose a suitable presentation for your results and let Stata create a graphic.

.. literalinclude:: docs/SOEPlong.do
    :linenos:
    :lines: 105-124

.. figure:: png/SOEPlong_15.png
    :align: center

The graph shows that a positive effect on life satisfaction can be observed when family status changes from single to married. In the subsequent years of marriage, life satisfaction decreases again and approaches the level observed before the marriage.

Last change: |today|